1. Import Basic Libraries

2. Reading in the Dataset, Data Preparation and Cleaning

Data understanding

Before we start any analysis, we need to understand both the business side and the data side.

Here we can see that the dataset has 3 columns: a date column, a text column, and a column containing NaN values.
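As a rough illustration of how an exported chat can end up in a dataframe, here is a minimal sketch. The line layout ("DD/MM/YYYY, HH:MM - Sender: message"), the sample messages, and the names `raw_lines`, `LINE_RE` and `chat_df` are all assumptions made for this example; real exports differ between devices and locales, and the original notebook may have loaded the file differently.

```python
import re
import pandas as pd

# Hypothetical sample lines in one common export layout (an assumption):
# "DD/MM/YYYY, HH:MM - Sender: message"
raw_lines = [
    "07/05/2019, 21:15 - Alice: Hello everyone",
    "07/05/2019, 21:16 - Bob: Hi Alice",
    "08/05/2019, 09:02 - Alice: Good morning",
    "this line continues the previous message",
]

# Timestamp, sender and message text are captured as three groups.
LINE_RE = re.compile(r"^(\d{2}/\d{2}/\d{4}, \d{2}:\d{2}) - ([^:]+): (.*)$")

records = []
for line in raw_lines:
    match = LINE_RE.match(line)
    if match:
        date, user, text = match.groups()
        records.append({"date": date, "user": user, "text": text})
    elif records:
        # A line without a timestamp is treated as a continuation
        # of the previous message.
        records[-1]["text"] += "\n" + line

chat_df = pd.DataFrame(records)
chat_df["date"] = pd.to_datetime(chat_df["date"], format="%d/%m/%Y, %H:%M")
print(chat_df)
```

Parsing the timestamp into a real datetime column early makes the later hour and month questions much simpler.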

Using info(), we can see that the number of non-null rows is not balanced across the columns: there are about 21k messages, while some of the columns have 23k and 700 entries.
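The kind of imbalance that info() reveals can be reproduced on a toy dataframe. The column names and values below are made up for illustration and are not taken from the real dataset.

```python
import numpy as np
import pandas as pd

# Toy data (hypothetical): one complete column, one with a gap,
# and one that is almost entirely empty.
df = pd.DataFrame({
    "date": ["07/05/2019", "07/05/2019", "08/05/2019", "08/05/2019"],
    "text": ["Hello", "Hi", None, "Good morning"],
    "extra": [np.nan, np.nan, np.nan, "x"],
})

# info() prints the non-null count and dtype of every column.
df.info()

# count() gives the same non-null counts as a Series, which is easier to reuse.
non_null = df.count()
print(non_null)
```

Columns whose non-null count differs from the total number of rows are the ones that need attention during cleaning.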

Now that we know the dataset contains missing values and unbalanced row counts, we can clean the data.
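One possible cleaning step, sketched on toy data: drop a column that is almost empty and remove rows with no message text. The column name `extra` and the decision to drop it are assumptions for this example; the right choice depends on what the real columns contain.

```python
import numpy as np
import pandas as pd

# Toy data (hypothetical) with the same kind of gaps as described above.
df = pd.DataFrame({
    "date": ["07/05/2019", "07/05/2019", "08/05/2019", "08/05/2019"],
    "text": ["Hello", "Hi", None, "Good morning"],
    "extra": [np.nan, np.nan, np.nan, "x"],
})

cleaned = (
    df.drop(columns=["extra"])   # remove the mostly empty column
      .dropna(subset=["text"])   # remove rows that have no message text
      .reset_index(drop=True)    # renumber the remaining rows from 0
)
print(cleaned)
```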

Let's get started on the Exploratory Data Analysis (EDA).

Question 1: Which users have sent the most messages in the group?

In any WhatsApp analysis, we want to know which user chats the most in the group, since this identifies the most active person in the chat group. Making use of pandas, we can count the messages per user directly from the dataframe. Doing so shows that the person who has sent the most messages in the group is “Babajide Odusami”.
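Counting messages per user can be sketched with `value_counts()`. The dataframe below is toy data with a hypothetical `user` column; the real column name in the notebook may differ.

```python
import pandas as pd

# Toy chat data (hypothetical users and messages).
chat_df = pd.DataFrame({
    "user": ["Alice", "Bob", "Alice", "Carol", "Alice", "Bob"],
    "text": ["hi", "hello", "how are you", "good", "great", "bye"],
})

# value_counts() counts the rows per user, sorted from most to least active.
message_counts = chat_df["user"].value_counts()
top_user = message_counts.idxmax()

print(message_counts)
print("Most active user:", top_user)
```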

Data Visualization:

We are going to use a bar chart for our data visualization. The results show that the user with the most messages is “Babajide Odusami”, with around 140k, which shows that “Babajide Odusami” is a very active member of the group.
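A bar chart of messages per user might look like the sketch below. It uses matplotlib with made-up counts; the labels, figure size and the `Agg` backend (which renders without a display) are choices made for this example only.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical message counts per user.
message_counts = pd.Series({"Alice": 3, "Bob": 2, "Carol": 1})

fig, ax = plt.subplots(figsize=(6, 4))
bars = ax.bar(message_counts.index.tolist(), message_counts.tolist())
ax.set_xlabel("User")
ax.set_ylabel("Number of messages")
ax.set_title("Messages per user")
fig.tight_layout()
```

In a notebook the figure is displayed inline; in a script it can be written to disk with `fig.savefig(...)`.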

Question 2: Which emojis are used the most and by which users?

Now we want to know which emojis are most widely used by the users. From this analysis we can assume that a user is likely to use the same emojis again in other chats. First we need to count the emojis in each message row, using UNICODE_EMOJI to look up the emoji characters.
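The counting idea can be sketched without any third-party package. Note that this example swaps in a simple regular expression over a few Unicode ranges instead of the UNICODE_EMOJI lookup mentioned above, so it is only an approximation and will miss some emojis (for example flags and multi-character sequences). The messages and names are made up.

```python
import re
from collections import Counter

# Rough stand-in for an emoji lookup table (an assumption for this sketch):
# matches single characters in a few common emoji and symbol ranges.
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")

# Hypothetical (user, message) pairs.
messages = [
    ("Alice", "Good morning 😀😀"),
    ("Bob", "Nice one 😂"),
    ("Alice", "See you 👍 😀"),
]

emoji_totals = Counter()   # overall count per emoji
emoji_by_user = {}         # per-user count per emoji
for user, text in messages:
    found = EMOJI_RE.findall(text)
    emoji_totals.update(found)
    emoji_by_user.setdefault(user, Counter()).update(found)

print(emoji_totals.most_common())
print(emoji_by_user)
```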

As you can see, we have extracted the emojis from whatsapp_df3 and successfully put them into a dataframe table. Now all we need to do is plot them in a pie chart as our data visualization.
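A pie chart of emoji shares could be drawn roughly as follows. The counts are invented, and the slices are labelled with plain text names rather than emoji glyphs, because many default fonts cannot render emojis in matplotlib.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt

# Hypothetical emoji counts, keyed by a text name for each emoji.
emoji_counts = {"laughing": 5, "thumbs up": 3, "heart": 2}

fig, ax = plt.subplots(figsize=(5, 5))
wedges, texts, autotexts = ax.pie(
    list(emoji_counts.values()),
    labels=list(emoji_counts.keys()),
    autopct="%1.0f%%",  # print each slice's share as a whole percentage
)
ax.set_title("Share of emojis used")
```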

Question 4: Which word or text did the users use the most?

Here we are going to use a word cloud as a visual representation of the words in the chat, to determine which words are most widely used by the users. The reason behind this analysis is to understand user behavior: if a word is used repeatedly, we can say that the user is more likely to use that particular word or text again in other chats.
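A word cloud is essentially a picture of word frequencies, so the underlying counts can be sketched with the standard library alone. The messages and the tiny stop-word list below are made up for illustration.

```python
import re
from collections import Counter

# Hypothetical messages.
messages = [
    "That guy is funny",
    "Which guy?",
    "The guy from the office",
    "ok",
]

# Tiny illustrative stop-word list; a real analysis would use a fuller one.
STOPWORDS = {"the", "is", "from", "that", "which"}

words = []
for text in messages:
    # Lowercase the text and keep only runs of letters and apostrophes.
    for word in re.findall(r"[a-z']+", text.lower()):
        if word not in STOPWORDS:
            words.append(word)

word_freq = Counter(words)
print(word_freq.most_common(3))
```

A frequency mapping like `word_freq` is the kind of input that the `wordcloud` package's `WordCloud.generate_from_frequencies` accepts, if that package is the one being used.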

Data Visualization:

As you can see, the most used word is 'guy'.

Question 3: The most active hour in the WhatsApp group chat

This analysis helps us understand the hours at which the members are most active on WhatsApp. It depends on two variables: the number of messages and the hour of the day. From these we can tell which hours are the most active.
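Grouping messages by hour can be sketched as below, assuming the timestamps are already stored in a datetime column. The column names and the sample timestamps are hypothetical.

```python
import pandas as pd

# Toy chat data (hypothetical timestamps).
chat_df = pd.DataFrame({
    "date": pd.to_datetime([
        "2019-05-07 21:15",
        "2019-05-07 21:40",
        "2019-05-08 09:02",
        "2019-05-08 21:05",
        "2019-05-09 13:30",
    ]),
    "text": ["a", "b", "c", "d", "e"],
})

# Extract the hour of the day (0-23) and count the messages in each hour.
chat_df["hour"] = chat_df["date"].dt.hour
messages_per_hour = chat_df.groupby("hour")["text"].count()
busiest_hour = messages_per_hour.idxmax()

print(messages_per_hour)
print("Busiest hour:", busiest_hour)
```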

From the results we can see that the most active time in the WhatsApp chat group is 2100hrs, that is, 9pm. A likely reason is that by that time the group members are back from work.

Question 5: Which month has the most messages, and is therefore the busiest month?

This group existed between 07/05/2019 and 14/06/2021.
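Because the chat spans more than two years, one reasonable way to find the busiest month is to group by year and month together, so that the same month in different years is not merged. The sketch below uses made-up dates in the DD/MM/YYYY format used above; the column names are hypothetical.

```python
import pandas as pd

# Toy chat data (hypothetical dates, day first).
chat_df = pd.DataFrame({
    "date": pd.to_datetime(
        ["07/05/2019", "12/03/2020", "15/03/2020", "30/03/2020", "14/06/2021"],
        format="%d/%m/%Y",
    ),
    "text": ["a", "b", "c", "d", "e"],
})

# to_period("M") keeps both year and month, e.g. 2020-03.
chat_df["month"] = chat_df["date"].dt.to_period("M")
messages_per_month = chat_df.groupby("month").size()
busiest_month = messages_per_month.idxmax()

print(messages_per_month)
print("Busiest month:", busiest_month)
```

If the question is instead about the calendar month regardless of year, grouping on `chat_df["date"].dt.month` would be the alternative.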